Zero Shot Classification (Generative Models)
Synopsis
Applies a Zero Shot Classification model for a given prompt.
Description
Applies a zero-shot classification model to a given prompt. These models classify text without having been trained on a predefined set of classes. Instead, the classes are defined only when the prompt is sent to the model. For example, you could send the text “My laptop is broken, I need a replacement asap” together with the classes “urgent” and “not urgent” and get the result “urgent”. You could also send the same text with the classes “software” and “hardware” and get the result “hardware” from the same generic model. Prompts can refer to data columns by putting the column name in double brackets, for example [[column_name]]. The candidate classes are specified either in a data column or with the “classes” parameter, with all class values separated by the pipe symbol “|”. The predictions as well as the scores for each of the classes will be stored in two new columns as a result of this operator.
Input
- data (Data table)
The data which will be injected into the prompt. Column values of the data set can be accessed with [[column_name]] in the prompt.
- model (File)
The optional model directory (in your project / repository or your file system). Has to be provided if the parameter "use local model" is true. Typically, this is only necessary if you want to use your own finetuned local version of a model.
Output
- data (Data table)
The input data plus one or more new columns containing the results of the prompts sent to the model.
- model (File)
The model directory which has been delivered as input.
Parameters
- use local model Indicates whether a model from a local file directory or a model from the Huggingface portal should be used. If a local model is to be used, all task operators require a file object referencing the model directory as a second input. If this parameter is unchecked, you need to specify the full model name from the Huggingface portal in the “model” parameter.
- model The model from the Huggingface portal which will be used by the operator. Only used when the “use local model” parameter is unchecked. The model name needs to be the full model name as found on each model card on the Huggingface portal. Please be aware that using large models can result in downloads of many gigabytes of data and models will be stored in a local cache.
- name The name of the newly generated column with the predictions.
- prompt The prompt used for querying the model. Please note that you can reference the values of any of the input data columns with [[column_name]]. You may need to use a prompt prefix such as “Translate to Dutch: [[column_name]]” to tell the model what it is supposed to do.
- classes from data Indicates whether the class candidates are taken from a data column or are defined by the “classes” parameter of this operator. If you define the classes with a data column, you can use different classes for each row. If you define the classes with the parameter, the class candidates are the same for all rows, which is equivalent to a regular text classification task.
- classes column Only used when the classes are derived from a data column. The data column containing the class candidates for each row. Classes need to be separated by the pipe symbol "|". They can be different for each row.
- classes Only used when the classes are not different for each row and hence are not derived from a data column. In this case you can define a set of class candidates which are used for all rows of your data which turns this into a regular text classification operator. Classes need to be separated by the pipe symbol "|".
- device Where the computation should take place. Either on a GPU, a CPU, or Apple’s MPS architecture. If set to Automatic, the operator will prefer the GPU if available and will fall back to CPU otherwise.
- device indices If you have multiple GPUs and computation is set up to happen on GPUs you can specify which ones are used with this parameter. Counting of devices starts with 0. The default of “0” means that the first GPU device in the system will be used, a value of “1” would refer to the second and so on. You can utilize multiple GPUs by providing a comma-separated list of device indices. For example, you could use “0,1,2,3” on a machine with four GPUs if all four should be utilized. Please note that RapidMiner performs data-parallel computation which means that the model needs to be small enough to be completely loaded on each of your GPUs.
- confidence digits The number of digits the confidence values are rounded to. Only used when the classes are derived from a data column.
- hypothesis template The template used to turn each label into an NLI-style hypothesis. This template must include a {} or similar syntax for the candidate label to be inserted into the template. For example, the default template is "This example is {}."
- multi label Whether or not multiple candidate labels can be true. If “false”, the scores are normalized such that the sum of the label likelihoods for each sequence is 1. If “true”, the labels are considered independent and probabilities are normalized for each candidate by doing a softmax of the entailment score vs. the contradiction score.
- data type Specifies the data type under which the model should be loaded. Using lower precisions can reduce memory usage while leading to slightly less accurate results in some cases. If set to “auto” the data precision is derived from the model itself.
- revision The specific model version to use. The default is “main”. The value can be a branch name, a tag name, or a commit id of the model in the Huggingface git repository.
- trust remote code Whether or not to allow for custom code defined on the Hub in their own modeling, configuration, tokenization or even pipeline files.
- conda environment The conda environment used for this model task. Additional packages may be installed into this environment. Please refer to the extension documentation for additional details on this and on version requirements.
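The effect of the "hypothesis template" and "multi label" parameters can be illustrated with a minimal sketch in plain Python. It is not the operator's implementation: the function names and the raw entailment and contradiction scores are hypothetical, and the sketch only assumes that the underlying NLI model returns one entailment score and one contradiction score per candidate label.

```python
import math


def build_hypotheses(labels, template="This example is {}."):
    """Insert each candidate label into the hypothesis template."""
    return [template.format(label) for label in labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_scores(entailment, contradiction, multi_label):
    """Turn raw per-label scores into label probabilities.

    entailment / contradiction: hypothetical raw model scores,
    one value per candidate label, in the same order.
    """
    if multi_label:
        # Labels are independent: compare entailment vs. contradiction
        # separately for every label. The results need not sum to 1.
        return [softmax([e, c])[0] for e, c in zip(entailment, contradiction)]
    # Single label: softmax over the entailment scores of all labels,
    # so the probabilities sum to 1.
    return softmax(entailment)


labels = ["urgent", "not urgent"]
print(build_hypotheses(labels))
print(normalize_scores([2.0, 0.0], [0.0, 2.0], multi_label=False))
print(normalize_scores([2.0, 0.0], [0.0, 2.0], multi_label=True))
```

With "multi label" disabled the scores compete with each other. With it enabled, several labels can receive a high score at the same time.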
Tutorial Processes
Using a zero shot classification model
This tutorial shows how to use a zero shot classification model. It creates some prompts and feeds them into the task operator. Please note that the "classes" column defines different classes for each of the rows. You could also use the same classes for all the rows by disabling the "classes from data" parameter and defining the "classes" parameter instead.
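The data preparation behind such a process can be sketched in plain Python. This only illustrates the [[column_name]] placeholder and the pipe-separated classes column. The helper names and column names are hypothetical, and trimming whitespace around class names is an assumption rather than documented operator behavior.

```python
import re

# Matches [[column_name]] placeholders inside a prompt.
PLACEHOLDER = re.compile(r"\[\[(.+?)\]\]")


def render_prompt(prompt, row):
    """Replace every [[column_name]] with that column's value in `row`."""
    return PLACEHOLDER.sub(lambda m: str(row[m.group(1)]), prompt)


def split_classes(value):
    """Split a pipe-separated class string into a list of candidates.

    Assumption: surrounding whitespace is trimmed and empty entries dropped.
    """
    return [c.strip() for c in value.split("|") if c.strip()]


# Two rows with the same text but different class candidates per row.
rows = [
    {"text": "My laptop is broken, I need a replacement asap",
     "classes": "urgent|not urgent"},
    {"text": "My laptop is broken, I need a replacement asap",
     "classes": "software|hardware"},
]

for row in rows:
    print(render_prompt("[[text]]", row), "->", split_classes(row["classes"]))
```

Each row yields one prompt and its own list of candidate classes, which mirrors the setup with "classes from data" enabled.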